The Multi-Node Topological Overlap Measure For Gene Neighborhood Analysis
Abstract
Defining the neighborhood of an initial set of nodes is an important task in network analysis. For example, we show that the neighborhood of an initial set of brain cancer related genes is highly enriched with other cancer genes. Defining a biologically meaningful concept of neighborhood in gene or protein networks remains an active area of research. In gene networks, genes with high topological overlap have been found to have an increased chance of being part of the same biological pathway. A pair of nodes in a network is said to have high topological overlap if both are strongly connected to the same group of nodes. Since our main interest lies in gene and protein networks, we propose a generalization of the topological overlap matrix to define the neighborhood of a set of genes: we generalize the standard pairwise topological overlap measure to multiple nodes. The resulting neighborhoods consist of tightly interconnected nodes. We provide empirical evidence that a neighborhood surrounding an initial set of 2 nodes can be far more informative than the neighborhood of a single node. Using cancer and yeast network applications, we provide empirical evidence that the multiTOM approach yields biologically meaningful results and compares favorably with alternative approaches. Our approach is implemented in the freely available multiTOM software package.
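As a rough illustration of the pairwise measure that the abstract generalizes, the sketch below computes topological overlap for an unweighted network and ranks the neighbors of a single seed node. It assumes the commonly used pairwise definition, TOM(i, j) = (shared neighbors + a_ij) / (min(k_i, k_j) + 1 - a_ij), which the abstract itself does not spell out. The function names `pairwise_tom` and `neighborhood` and the toy network are hypothetical; this is not the multiTOM package's interface, and the multi-node generalization is not reproduced here.

```python
def pairwise_tom(adj):
    """Pairwise topological overlap for a symmetric 0/1 adjacency matrix.

    `adj` is a list of lists with a zero diagonal. Assumed formula:
    TOM(i, j) = (shared + a_ij) / (min(k_i, k_j) + 1 - a_ij).
    """
    n = len(adj)
    degree = [sum(row) for row in adj]
    tom = [[0.0] * n for _ in range(n)]
    for i in range(n):
        tom[i][i] = 1.0  # convention: a node fully overlaps with itself
        for j in range(i + 1, n):
            # Count nodes u (other than i and j) adjacent to both i and j.
            shared = sum(
                adj[i][u] * adj[u][j] for u in range(n) if u != i and u != j
            )
            denom = min(degree[i], degree[j]) + 1 - adj[i][j]
            tom[i][j] = tom[j][i] = (shared + adj[i][j]) / denom
    return tom


def neighborhood(tom, seed, size):
    """Return the `size` non-seed nodes with the highest overlap with `seed`."""
    others = [v for v in range(len(tom)) if v != seed]
    others.sort(key=lambda v: tom[seed][v], reverse=True)
    return others[:size]


# Toy network (hypothetical): a triangle 0-1-2 with a tail 2-3-4.
adj = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
tom = pairwise_tom(adj)
top_two = neighborhood(tom, seed=0, size=2)
print(top_two)
```

In this toy network, node 3 is not directly connected to node 0 but still receives a nonzero overlap with it through their shared neighbor, node 2. A neighborhood defined this way therefore reflects shared connectivity rather than direct edges alone.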
Related resources
Network neighborhood analysis with the multi-node topological overlap measure
MOTIVATION The goal of neighborhood analysis is to find a set of genes (the neighborhood) that is similar to an initial 'seed' set of genes. Neighborhood analysis methods for network data are important in systems biology. If individual network connections are susceptible to noise, it can be advantageous to define neighborhoods on the basis of a robust interconnectedness measure, e.g. the topolo...
The Generalized Topological Overlap Matrix for Detecting Modules in Gene Networks
Systems biologic studies of gene and protein interaction networks have found that these networks are comprised of ‘modules’ (groups of tightly interconnected nodes). Module identification is an essential step towards understanding the whole network architecture. Here we will focus on module identification methods that are based on using a node dissimilarity measure in conjunction with a cluster...
Proposal for a Correction to the Temporal Correlation Coefficient Calculation for Temporal Networks
Measuring the topological overlap of two graphs becomes important when assessing the changes between temporally adjacent graphs in a time-evolving network. Current methods depend on the fraction of nodes that have persisting edges. This breaks down when there are nodes with no edges, persisting or otherwise. The following outlines a proposed correction to ensure that correlation metrics have th...
Fuzzy particle swarm optimization with nearest-better neighborhood for multimodal optimization
In the last decades, many efforts have been made to solve multimodal optimization problems using Particle Swarm Optimization (PSO). To produce good results, these PSO algorithms need to specify some niching parameters to define the local neighborhood. In this paper, our motivation is to propose the novel neighborhood structures that remove undesirable niching parameters without sacrificing perf...
Parameterized Neighborhood-based Flooding for Ad Hoc Wireless Networks
Flooding is a simple routing technique that can be used to transmit data from one node to every other node in a network. The focus of this paper is to investigate improvements to flooding techniques used in ad hoc wireless networks. Recent work has focused on using topological information to reduce the number of broadcasts. The number of broadcasts necessary to flood the network was the major p...
Publication date: 2006